Using Model-Based Clustering to Improve Predictions for Queueing Delay on Parallel Machines

نویسندگان

  • John Brevik
  • Daniel Nurmi
  • Richard Wolski
چکیده

Most space-sharing parallel computers presently operated by production high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users wishing to use these batch-queued resources may choose among different queues (charging different amounts) potentially on a number of machines to which they have access. In such a situation, the amount of time a user’s job will wait in any one batch queue can be a significant portion of the overall time from job submission to job completion. It thus becomes desirable to provide a prediction for the amount of time a given job can expect to wait in the queue. Further, it is natural to expect that attributes of an incoming job, specifically the number of processors requested and the amount of time requested, might impact that job’s wait time. In this work, we explore the possibility of generating accurate predictions by automatically grouping jobs having similar attributes using model-based clustering. Moreover, we implement this clustering technique for a time series of jobs so that predictions of future wait times can be generated in real time. Using trace-based simulation on data from 7 machines over a 9-year period from across the country, comprising over one million job records, we show that clustering either by requested time, requested number of processors, or the product of the two generally produces more accurate predictions than earlier, more naive, approaches and that automatic clustering outperforms administratordetermined clustering.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SCHMIB: Segregating Clusters Hierarchically Making Improved Bounds

Most space-sharing parallel computers presently operated by high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users wishing to use these batch-queued resources have the option of choosing between different queues (having different charging rates) potentially on a number of different machines where they have access. In such a situation, t...

متن کامل

Heuristic approach to solve hybrid flow shop scheduling problem with unrelated parallel machines

In hybrid flow shop scheduling problem (HFS) with unrelated parallel machines, a set of n jobs are processed on k machines. A mixed integer linear programming (MILP) model for the HFS scheduling problems with unrelated parallel machines has been proposed to minimize the maximum completion time (makespan). Since the problem is shown to be NP-complete, it is necessary to use heuristic methods to ...

متن کامل

Fuzzy Programming for Parallel Machines Scheduling: Minimizing Weighted Tardiness/Earliness and Flow Time through Genetic Algorithm

Appropriate scheduling and sequencing of tasks on machines is one of the basic and significant problems that a shop or a factory manager encounters; this is why in recent decades extensive studies have been done on scheduling issues. One type of scheduling problems is just-in-time (JIT) scheduling and in this area, motivated by JIT manufacturing, this study investigates a mathematical model for...

متن کامل

Fuzzy Programming for Parallel Machines Scheduling: Minimizing Weighted Tardiness/Earliness and Flowtime through Genetic Algorithm

Appropriate scheduling and sequencing of tasks on machines is one of the basic and significant problems that a shop or a factory manager encounters with it, this is why in recent decades extensive researches have been done on scheduling issues. A type of scheduling problems is just-in-time (JIT) scheduling and in this area, motivated by JIT manufacturing, this study investigates a mathematical ...

متن کامل

Multi-Objective Unrelated Parallel Machines Scheduling with Sequence-Dependent Setup Times and Precedence Constraints

This paper presents a novel, multi-objective model of a parallel machines scheduling problem that minimizes the number of tardy jobs and total completion time of all jobs. In this model, machines are considered as unrelated parallel units with different speeds. In addition, there is some precedence, relating the jobs with non-identical due dates and their ready times. Sequence-dependent setup t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Parallel Processing Letters

دوره 17  شماره 

صفحات  -

تاریخ انتشار 2007